P-adic arithmetic coding
Abstract
A new incremental algorithm for data compression is presented. For a sequence of input symbols, the algorithm incrementally constructs a p-adic integer as its output. The decoding process starts with the less significant part of the p-adic integer and incrementally reconstructs the sequence of input symbols. The algorithm is based on certain features of p-adic numbers and the p-adic norm. The p-adic coding algorithm may be considered a generalization of a popular compression technique, arithmetic coding. It is shown that for p = 2 the algorithm works as an integer variant of arithmetic coding; for a special class of models it gives exactly the same codes as Huffman's algorithm, and for another special model and a specific alphabet it gives Golomb-Rice codes.

“For more than forty years I've been speaking in prose without even knowing it!” Molière, Le Bourgeois Gentilhomme.

Introduction

The arithmetic coding algorithm in its modern version was published in Communications of the ACM in June 1987 [Witten], but the authors, Ian Witten, Radford Neal and John Cleary, referred to [Abrahamson] as “the first reference to what was to become the method of arithmetic coding”. So we may say that it has been known “for more than forty years”. The algorithm is now common knowledge: it has appeared in numerous textbooks (see for example [Salomon, Sayood]), reviews have been published [Bodden, Said], Dr. Dobb's Journal popularized it [Nelson], the wiki [wiki] contains an article about it, and many other sources can be found on the web. So why one more paper on this subject, and what is this “p-adic arithmetic”?

Let us go back to the original idea of arithmetic coding. In arithmetic coding a message is represented as a subinterval [b, e) of the unit half-open interval [0, 1). (We will give all definitions later.) When a new symbol s arrives, a new subinterval [b(s), e(s)) of [b, e) is constructed.
A common method of calculating the new subinterval is to divide the current interval into |A| subintervals (A is the alphabet and |A| is its number of symbols); each subinterval represents a symbol from A and has length proportional to the probability of that symbol. For a new symbol s, the encoder returns the corresponding subinterval [b(s), e(s)). Thus encoding is a process of narrowing intervals (we will call them message intervals), starting from the unit interval:

$$[0, 1) \equiv [b_0, e_0),\ [b_1, e_1),\ [b_2, e_2),\ \ldots,\ [b_t, e_t)$$

where

$$0 = b_0 \le b_1 \le b_2 \le \ldots \le b_t$$

$$1 = e_0 \ge e_1 \ge e_2 \ge \ldots \ge e_t$$

All $b_i$ and $e_i$ are real numbers. The last constructed subinterval may be used as the final output, or alternatively any point x from the last subinterval together with the message length. Usually, however, a special symbol EOM (End Of Message), which does not belong to the alphabet, is used to terminate the message. In this case a single point x suffices as the coding result.

Decoding is also a process of narrowing intervals. It starts with the unit interval and a point x inside it. The decoder divides the current interval into |A| subintervals and finds the one that contains the point x, say $[b_1, e_1)$. The symbol $s_1$ corresponding to this interval is pushed into an output buffer, and $[b_1, e_1)$ becomes the new current interval. This continues until the EOM symbol is received.
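The interval narrowing described above can be illustrated with a minimal Python sketch of classical real-interval arithmetic coding. It is not the p-adic algorithm of this paper. The names `MODEL`, `EOM`, `cumulative`, `encode` and `decode` are hypothetical, the three-entry probability model is an assumed toy example, and exact fractions stand in for the real numbers $b_i$ and $e_i$. The end-of-message marker is treated here as one extra entry of the model so that the decoder knows when to stop.

```python
from fractions import Fraction

EOM = "<EOM>"  # hypothetical end-of-message marker, not a symbol of the alphabet

# Assumed toy model: probabilities of each symbol (and of EOM) summing to 1.
MODEL = {"a": Fraction(1, 2), "b": Fraction(1, 4), EOM: Fraction(1, 4)}

def cumulative(model):
    """Assign each symbol a share [low, high) of the unit interval."""
    ranges, low = {}, Fraction(0)
    for sym, prob in model.items():
        ranges[sym] = (low, low + prob)
        low += prob
    return ranges

def encode(message, model):
    """Narrow [0, 1) once per symbol; return the final message interval [b, e)."""
    ranges = cumulative(model)
    b, e = Fraction(0), Fraction(1)
    for sym in list(message) + [EOM]:
        width = e - b
        low, high = ranges[sym]
        b, e = b + width * low, b + width * high
    return b, e

def decode(x, model):
    """Repeatedly find the subinterval containing x until EOM is reached."""
    ranges = cumulative(model)
    b, e = Fraction(0), Fraction(1)
    out = []
    while True:
        width = e - b
        for sym, (low, high) in ranges.items():
            nb, ne = b + width * low, b + width * high
            if nb <= x < ne:
                break
        else:
            raise ValueError("x lies outside the current interval")
        if sym == EOM:
            return "".join(out)
        out.append(sym)
        b, e = nb, ne

interval = encode("ab", MODEL)
print(interval)                    # the message interval [b, e)
print(decode(interval[0], MODEL))  # any point of the interval decodes the message
```

With this model the message "ab" narrows [0, 1) to [0, 1/2), then to [1/4, 3/8), and the EOM step leaves [11/32, 3/8); any point of that last interval decodes back to "ab".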
Similar resources
A New Representation of the Rational Numbers for Fast Easy Arithmetic
A novel system for representing the rational numbers based on Hensel's p-adic arithmetic is proposed. The new scheme uses a compact variable-length encoding that may be viewed as a generalization of radix complement notation. It allows exact arithmetic, and approximate arithmetic under programmer control. It is superior to existing coding methods because the arithmetic operations take particula...
Faster p-adic Feasibility for Certain Multivariate Sparse Polynomials
We present algorithms revealing new families of polynomials allowing sub-exponential detection of p-adic rational roots, relative to the sparse encoding. For instance, we show that the case of honest n-variate (n+ 1)-nomials is doable in NP and, for p exceeding the Newton polytope volume and not dividing any coefficient, in constant time. Furthermore, using the theory of linear forms in p-adic ...
Analysis Of Exact Solution Of Linear Equation Systems Over Rational Numbers By Parallel p-adic Arithmetic
We study and investigate the p-adic arithmetic along with analysis of exact solution of linear equation systems over rational numbers. Initially we study the basic concepts involving the p-adic numbers and why they form a better representation. After that we describe a parallel implementation of an algorithm for solving systems of linear equations over the field of rational number based on the ...
Solving Univariate P-adic Constraints
We describe an algorithm for solving systems of univariate p-adic constraints. In analogy with univariate real constraints, we formalize univariate p-adic constraints as univariate polynomial equations and order comparisons between p-adic values of univariate polynomials. Systems of constraints are arbitrary boolean combinations of such constraints. Our method combines techniques of Presburger ...
On the Arithmetic of the Bc-system
For each prime p and each embedding σ of the multiplicative group of an algebraic closure of Fp as complex roots of unity, we construct a p-adic indecomposable representation πσ of the integral BC-system as additive endomorphisms of the big Witt ring of F̄p. The obtained representations are the p-adic analogues of the complex, extremal KMS∞ states of the BC-system. The role of the Riemann zeta f...
Journal: CoRR
Volume: abs/0704.0834
Publication date: 2007